Smart Paper Notes

Kernel Biclustering algorithm in Hilbert Spaces

Marcos Matabuena , J. C Vidal , Oscar Hernan Madrid Padilla , Dino Sejdinovic

Category: Machine Learning (Statistics)

2022-08-07

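Biclustering algorithms partition data and covariates simultaneously, providing new insights in several domains, such as analyzing gene expression to discover new biological functions. This paper develops a new model-free biclustering algorithm in abstract spaces using the notions of energy distance (ED) and maximum mean discrepancy (MMD), two distances between probability distributions that can handle complex data such as curves or graphs. The proposed method can learn more general and complex cluster shapes than most existing approaches in the literature, which usually focus on detecting differences in mean and variance. Although the biclustering configurations of our approach are constrained to create disjoint structures at the datum and covariate levels, the results are competitive. Assuming a proper kernel choice, our results are similar to those of state-of-the-art methods in their optimal scenarios, and they outperform those methods when cluster differences are concentrated in higher-order moments. The model's performance has been tested in several situations involving simulated and real-world datasets. Finally, new theoretical consistency results are established using some tools from optimal transport theory.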
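As a rough illustration of the two distances named in the abstract, the sketch below estimates the energy distance and the squared MMD between two samples with NumPy. It is not the authors' biclustering algorithm: the function names, the Gaussian kernel, and the `bandwidth` parameter are assumptions chosen for the example, and both estimators are the simple biased (V-statistic) versions.

```python
import numpy as np

def pairwise_sq_dists(a, b):
    # Squared Euclidean distance between every row of a and every row of b.
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def energy_distance(x, y):
    # V-statistic estimate of 2 E|X - Y| - E|X - X'| - E|Y - Y'|.
    d_xy = np.sqrt(pairwise_sq_dists(x, y)).mean()
    d_xx = np.sqrt(pairwise_sq_dists(x, x)).mean()
    d_yy = np.sqrt(pairwise_sq_dists(y, y)).mean()
    return 2.0 * d_xy - d_xx - d_yy

def mmd_squared(x, y, bandwidth=1.0):
    # Biased estimate of the squared MMD with a Gaussian kernel
    # (the kernel choice and bandwidth are assumptions for this sketch).
    def kernel(a, b):
        return np.exp(-pairwise_sq_dists(a, b) / (2.0 * bandwidth ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(200, 2))
same_b = rng.normal(0.0, 1.0, size=(200, 2))
# Same mean as same_a but a larger variance: a difference beyond the mean.
wide = rng.normal(0.0, 3.0, size=(200, 2))

print("energy distance, same distribution:", energy_distance(same_a, same_b))
print("energy distance, different variance:", energy_distance(same_a, wide))
print("MMD^2, same distribution:", mmd_squared(same_a, same_b))
print("MMD^2, different variance:", mmd_squared(same_a, wide))
```

Both quantities should be close to zero for samples drawn from the same distribution and clearly larger when the distributions differ, even when, as here, the means coincide.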

translated by Google Translate

Neural interval-censored Cox regression with feature selection

Carlos García Meixide , Marcos Matabuena , Michael R. Kosorok

Category: Machine Learning (Statistics) | Machine Learning

2022-06-14

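The classical Cox model emerged in 1972, promoting breakthroughs in how patient prognosis is quantified using time-to-event analysis in biomedicine. One of the model's most useful characteristics is the interpretability of the variables in the analysis. However, this comes at the price of introducing strong assumptions about the functional form of the regression model. To bridge this gap, this paper aims to exploit the explainability advantages of the classical Cox model in the interval-censored setting using a new Lasso neural network that selects the most relevant variables while quantifying non-linear relations between predictors and survival times. The gain of the new method is illustrated empirically in an extensive simulation study with examples involving linear and non-linear ground dependencies. We also demonstrate the performance of our strategy in the analysis of physiological, clinical, and accelerometer data from the NHANES 2003-2006 waves to predict the effect of physical activity on patient survival. Our method outperforms prior results in the literature that use the traditional Cox model.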
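For context, the functional-form assumption mentioned in the abstract can be written out. The first line is the standard Cox proportional hazards model. The second line is a generic non-linear relaxation, shown only as an illustration: the symbol $f$ is an assumption of this note and is not taken from the paper, whose exact formulation is not given in the abstract.

```latex
% Classical Cox model: the log-hazard is linear in the covariates x.
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Generic non-linear relaxation (illustrative only): the linear
% predictor is replaced by an unknown function f, e.g. a neural network.
h(t \mid x) = h_0(t)\,\exp\!\left(f(x)\right)
```

Here $h_0(t)$ is the baseline hazard and $\beta$ the coefficient vector. The interpretability of the classical model comes from $\beta$, and its rigidity from the linearity of $\beta^{\top} x$.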


SOLD: Sinhala Offensive Language Dataset

Tharindu Ranasinghe , Isuri Anuradha , Damith Premasiri , Kanishka Silva , Hansi Hettiarachchi , Lasitha Uyangodage , Marcos Zampieri

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-01

The spread of offensive content online, such as hate speech and cyber-bullying, is a global phenomenon. This has sparked interest in the artificial intelligence (AI) and natural language processing (NLP) communities, motivating the development of various systems trained to detect potentially harmful content automatically. These systems require annotated datasets to train the machine learning (ML) models. However, with a few notable exceptions, most datasets on this topic cover English and a few other high-resource languages. As a result, research in offensive language identification has been limited to these languages. This paper addresses this gap by tackling offensive language identification in Sinhala, a low-resource Indo-Aryan language spoken by over 17 million people in Sri Lanka. We introduce the Sinhala Offensive Language Dataset (SOLD) and present multiple experiments on this dataset. SOLD is a manually annotated dataset containing 10,000 posts from Twitter annotated as offensive or not offensive at both the sentence level and the token level, improving the explainability of the ML models. SOLD is the first large publicly available offensive language dataset compiled for Sinhala. We also introduce SemiSOLD, a larger dataset containing more than 145,000 Sinhala tweets, annotated following a semi-supervised approach.


Weakly-supervised detection of AMD-related lesions in color fundus images using explainable deep learning

José Morano , Álvaro S. Hervella , José Rouco , Jorge Novo , José I. Fernández-Vigo , Marcos Ortega

Category: Computer Vision

2022-12-01

Age-related macular degeneration (AMD) is a degenerative disorder affecting the macula, a key area of the retina for visual acuity. It is currently the most frequent cause of blindness in developed countries. Although some promising treatments have been developed, their effectiveness is low in advanced stages. This emphasizes the importance of large-scale screening programs. Nevertheless, implementing such programs for AMD is usually unfeasible, since the population at risk is large and the diagnosis is challenging. All this motivates the development of automatic methods. Several works have achieved positive results for AMD diagnosis using convolutional neural networks (CNNs). However, none incorporates explainability mechanisms, which limits their use in clinical practice. We therefore propose an explainable deep learning approach for the diagnosis of AMD via the joint identification of its associated retinal lesions. In our proposal, a CNN is trained end-to-end for the joint task using image-level labels. The lesion information provided is of clinical interest, as it makes it possible to assess the developmental stage of AMD. Additionally, the approach makes it possible to explain the diagnosis from the identified lesions. This is possible thanks to the use of a CNN with a custom setting that links the lesions and the diagnosis. Furthermore, the proposed setting also yields coarse lesion segmentation maps in a weakly-supervised way, further improving the explainability. The training data for the approach can be obtained without much extra work by clinicians. The experiments conducted demonstrate that our approach can identify AMD and its associated lesions satisfactorily, while providing adequate coarse segmentation maps for the most common lesions.


Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking

Elena Luna , Juan Carlos San Miguel , José María Martínez , Marcos Escudero-Viñolo

Category: Computer Vision

2022-11-28

This letter focuses on the task of Multi-Target Multi-Camera vehicle tracking. We propose to associate single-camera trajectories into multi-camera global trajectories by training a Graph Convolutional Network. Our approach processes all cameras simultaneously, providing a global solution, and it is also robust to large desynchronization between cameras. Furthermore, we design a new loss function to deal with class imbalance. Our proposal outperforms the related work, showing better generalization without requiring ad-hoc manual annotations or thresholds, unlike the compared approaches.


Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report

Andrey Ignatov , Grigory Malivenko , Radu Timofte , Lukasz Treszczotko , Xin Chang , Piotr Ksiazek , Michal Lopuszynski , Maciej Pioro , Rafal Rudnicki , Maciej Smyl

Category: Computer Vision

2022-11-07

Various depth estimation models are now widely used on many mobile and IoT devices for image segmentation, bokeh effect rendering, object tracking and many other mobile tasks. Thus, it is crucial to have efficient and accurate depth estimation models that can run fast on low-power mobile chipsets. In this Mobile AI challenge, the target was to develop deep learning-based single image depth estimation solutions that can show real-time performance on IoT platforms and smartphones. For this, the participants used a large-scale RGB-to-depth dataset that was collected with the ZED stereo camera, which is capable of generating depth maps for objects located at up to 50 meters. The runtime of all models was evaluated on the Raspberry Pi 4 platform, where the developed solutions were able to generate VGA resolution depth maps at up to 27 FPS while achieving high fidelity results. All models developed in the challenge are also compatible with any Android or Linux-based mobile devices. Their detailed description is provided in this paper.


Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

Andrey Ignatov , Radu Timofte , Shuai Liu , Chaoyu Feng , Furui Bai , Xiaotao Wang , Lei Lei , Ziyao Yi , Yan Xiang , Zibin Liu

Category: Computer Vision

2022-11-07

The role of mobile cameras has increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline that replaces the standard mobile ISPs and can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format FujiFilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU, which provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in less than 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.


scikit-fda: A Python Package for Functional Data Analysis

Carlos Ramos-Carreño , José Luis Torrecilla , Miguel Carbajo-Berrocal , Pablo Marcos , Alberto Suárez

Category: Machine Learning | Machine Learning (Statistics)

2022-11-04

The library scikit-fda is a Python package for Functional Data Analysis (FDA). It provides a comprehensive set of tools for representation, preprocessing, and exploratory analysis of functional data. The library is built upon and integrated in Python's scientific ecosystem. In particular, it conforms to the scikit-learn application programming interface so as to take advantage of the functionality for machine learning provided by this package: pipelines, model selection, and hyperparameter tuning, among others. The scikit-fda package has been released as free and open-source software under a 3-Clause BSD license and is open to contributions from the FDA community. The library's extensive documentation includes step-by-step tutorials and detailed examples of use.


Exploring Attention GAN for Vehicle Motion Prediction

Carlos Gómez-Huélamo , Marcos V. Conde , Miguel Ortiz , Santiago Montiel , Rafael Barea , Luis M. Bergasa

Category: Computer Vision | Artificial Intelligence | Robotics

2022-09-26

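The design of a safe and reliable Autonomous Driving stack (ADS) is one of the most challenging tasks of our era. These ADS are expected to drive in highly dynamic environments with full autonomy and with greater reliability than human beings. In that sense, to navigate efficiently and safely through arbitrarily complex traffic scenarios, ADS must be able to forecast the future trajectories of surrounding actors. Current state-of-the-art models are typically based on recurrent, graph and convolutional networks, achieving noticeable results in the context of vehicle prediction. In this paper we explore the influence of attention in generative models for motion prediction, considering both physical and social context to compute the most plausible trajectories. We first encode the past trajectories using an LSTM network, whose output serves as input to a Multi-Head Self-Attention module that computes the social context. In addition, we formulate a weighted interpolation to calculate the velocity and orientation in the last observation frame in order to compute acceptable target points, extracted from the driveable area of the HD map information, which represents our physical context. Finally, the input of our generator is a white noise vector sampled from a multivariate normal distribution, while the social and physical context are its conditions, in order to predict plausible trajectories. We validate our method using the Argoverse Motion Forecasting Benchmark 1.1, achieving competitive unimodal results.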
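The social-context step described above (per-agent trajectory encodings passed through multi-head self-attention) can be sketched in NumPy as follows. This is a generic scaled dot-product attention example, not the authors' network: the random `encodings` stand in for LSTM outputs, the projection matrices are untrained random values, the output projection is omitted, and all names and sizes are assumptions made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(h, w_q, w_k, w_v, num_heads):
    # h has shape (num_agents, d_model): one encoding per agent.
    n, d_model = h.shape
    d_head = d_model // num_heads

    def split_heads(x):
        # (n, d_model) -> (num_heads, n, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(h @ w_q)
    k = split_heads(h @ w_k)
    v = split_heads(h @ w_v)

    # Each agent attends to every agent, itself included.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                # (heads, n, d_head)

    # Concatenate the heads back into (n, d_model).
    return context.transpose(1, 0, 2).reshape(n, d_model), weights

rng = np.random.default_rng(0)
num_agents, d_model, num_heads = 5, 16, 4

# Hypothetical stand-in for the LSTM encodings of past trajectories.
encodings = rng.normal(size=(num_agents, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))

social_context, attn = multi_head_self_attention(
    encodings, w_q, w_k, w_v, num_heads
)
print(social_context.shape, attn.shape)
```

Each row of `social_context` mixes information from all agents in the scene, weighted by the attention scores, which is the sense in which the module summarizes social interactions.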


Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration

Marcos V. Conde , Ui-Jin Choi , Maxime Burchi , Radu Timofte

Category: Computer Vision

2022-09-22

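Compression plays an important role in the efficient transmission and storage of images and videos through band-limited systems such as streaming services, virtual reality or video games. However, compression unavoidably leads to artifacts and the loss of original information, which may severely degrade the visual quality. For these reasons, quality enhancement of compressed images has become a popular research topic. While most state-of-the-art image restoration methods are based on convolutional neural networks, transformer-based methods such as SwinIR show impressive performance on these tasks. In this paper, we explore the novel Swin Transformer V2 to improve SwinIR for image super-resolution, in particular in the compressed input scenario. Using this method we can tackle the major issues in training transformer vision models, such as training instability, resolution gaps between pre-training and fine-tuning, and data hunger. We conduct experiments on three representative tasks: JPEG compression artifact removal, image super-resolution (classical and lightweight), and compressed image super-resolution. Experimental results demonstrate that our method, Swin2SR, can improve the training convergence and performance of SwinIR, and is a top-5 solution at the "AIM 2022 Challenge on Super-Resolution of Compressed Image and Video".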
